Drones can provide adaptable camera views with minimal constraints to support robot teleoperation. Moreover, drone views can be automated to reduce the operator's burden during remote operation. However, existing methods do not address two important aspects of using a drone as an autonomous view provider. The first is how the drone should choose among a set of quality viewpoints within the workspace (e.g., opposite sides of an object). The second is how to compensate for unavoidable drone pose uncertainty. In this paper, we present a nonlinear optimization method that produces effective and adaptive drone viewpoints for teleoperation with an articulated manipulator. Our first key idea is to use sparse human input to switch between multiple automatically generated drone viewpoints. Our second key idea is to introduce optimization objectives that maintain a view of the manipulator while accounting for drone uncertainty and its effect on view occlusion and environment collision. We provide an instantiation of our drone viewpoint method in a drone-manipulator teleoperation system. Finally, we provide an initial validation of our method on tasks representative of common household and industrial manipulation.
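The abstract does not give the objective's concrete form; purely as a hedged illustration of viewpoint selection via nonlinear optimization, the sketch below minimizes an invented cost that combines a standoff term, an uncertainty-inflated collision margin, and a crude occlusion penalty (all terms, weights, and geometry are assumptions, not the paper's formulation).

```python
# Minimal sketch (not the paper's method): choose a drone camera position by
# minimizing a hand-made cost with a generic nonlinear optimizer.
import numpy as np
from scipy.optimize import minimize

TARGET = np.array([0.0, 0.0, 0.5])      # manipulator end-effector (assumed known)
OBSTACLE = np.array([0.5, 0.5, 0.5])    # single spherical obstacle
OBSTACLE_R = 0.3
POSE_SIGMA = 0.1                        # assumed drone position uncertainty (m)

def cost(p):
    p = np.asarray(p)
    view_dist = (np.linalg.norm(p - TARGET) - 1.5) ** 2       # keep ~1.5 m standoff
    # Inflate the obstacle by the pose uncertainty so collisions remain unlikely
    # even if the drone drifts from its commanded position.
    clearance = np.linalg.norm(p - OBSTACLE) - (OBSTACLE_R + 3 * POSE_SIGMA)
    collision = max(0.0, -clearance) ** 2 * 100.0
    # Crude occlusion proxy: penalize the obstacle sitting near the line of sight.
    d = TARGET - p
    t = np.clip(np.dot(OBSTACLE - p, d) / np.dot(d, d), 0.0, 1.0)
    los_gap = np.linalg.norm(p + t * d - OBSTACLE) - OBSTACLE_R
    occlusion = max(0.0, -los_gap) ** 2 * 100.0
    return view_dist + collision + occlusion

res = minimize(cost, x0=np.array([1.0, -1.0, 1.2]), method="Nelder-Mead")
print("candidate viewpoint:", res.x)
```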
Remotely programming a robot to perform tasks typically relies on registering objects of interest in the robot's environment. These tasks often involve articulating objects, such as opening or closing a valve. However, existing human-in-the-loop methods for registering objects do not consider articulation and its corresponding effect on object geometry, which can cause these methods to fail. In this work, we present an approach in which the registration system attempts to automatically determine an object's model, pose, and articulation from user-selected points using nonlinear fitting and the iterative closest point (ICP) algorithm. When the fit is incorrect, the operator can iteratively intervene with corrections, after which the system refits the object. We present an implementation of our fitting procedure for one-degree-of-freedom (DOF) objects with revolute joints and evaluate it with a user study, which shows that it improves user performance in task time, task load, ease of use, and usefulness compared to a manual registration method. We also present an example that integrates our approach into an end-to-end system for articulating a remote valve.
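As a hedged illustration of the ICP building block referenced above (not the authors' full procedure, which also fits the articulation model), a bare point-to-point ICP in numpy looks roughly like this:

```python
# Minimal point-to-point ICP sketch in numpy (illustrative only; the paper's
# system additionally estimates the object's articulation with nonlinear fitting).
import numpy as np

def icp(source, target, iters=20):
    """Align `source` (N x 3) to `target` (M x 3); returns rotation R and translation t."""
    R, t = np.eye(3), np.zeros(3)
    src = source.copy()
    for _ in range(iters):
        # Nearest-neighbor correspondences (brute force for clarity).
        d = np.linalg.norm(src[:, None, :] - target[None, :, :], axis=2)
        corr = target[np.argmin(d, axis=1)]
        # Best rigid transform via SVD (Kabsch algorithm).
        mu_s, mu_c = src.mean(0), corr.mean(0)
        H = (src - mu_s).T @ (corr - mu_c)
        U, _, Vt = np.linalg.svd(H)
        D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
        R_step = Vt.T @ D @ U.T
        t_step = mu_c - R_step @ mu_s
        src = src @ R_step.T + t_step
        R, t = R_step @ R, R_step @ t + t_step
    return R, t
```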
As social robots become increasingly prevalent in everyday environments, they will participate in conversations and must appropriately manage the information shared with them. However, little is known about how robots should appropriately discern the sensitivity of information, which has major implications for human-robot trust. As a first step toward addressing this problem, we designed a privacy controller, CONFIDANT, for conversational social robots, capable of modeling privacy boundaries using contextual metadata from conversations (e.g., sentiment, relationships, topic). We then conducted two crowdsourced user studies. The first study (n = 174) focused on whether various human-human interaction scenarios are perceived as private/sensitive or non-private/non-sensitive. The findings from our first study were used to generate association rules. Our second study (n = 95) evaluated the effectiveness and accuracy of the privacy controller in human-robot interaction scenarios by comparing a robot using our privacy controller against a baseline robot with no privacy controls. Our results show that the robot with the privacy controller outperforms the robot without one in perceived privacy, trustworthiness, and social awareness. We conclude that integrating privacy controllers into real human-robot conversations can enable more trustworthy robots. This initial privacy controller can serve as a foundation for more sophisticated solutions.
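The abstract only states that association rules were generated from the first study's annotations; a minimal, generic sketch of support/confidence rule mining over invented scenario tags (not CONFIDANT's actual pipeline) might look like this:

```python
# Toy support/confidence computation for association rules over annotated
# scenarios (illustrative only; scenarios, tags, and thresholds are invented).
from itertools import combinations

scenarios = [
    {"topic=health", "relationship=stranger", "private"},
    {"topic=health", "relationship=friend", "private"},
    {"topic=weather", "relationship=stranger", "not_private"},
    {"topic=finance", "relationship=stranger", "private"},
]

def rules(data, min_support=0.25, min_confidence=0.8):
    n = len(data)
    items = sorted(set().union(*data))
    out = []
    for a, b in combinations(items, 2):
        for lhs, rhs in ((a, b), (b, a)):
            both = sum(1 for s in data if lhs in s and rhs in s)
            lhs_count = sum(1 for s in data if lhs in s)
            if lhs_count and both / n >= min_support and both / lhs_count >= min_confidence:
                out.append((lhs, rhs, both / n, both / lhs_count))
    return out

for lhs, rhs, sup, conf in rules(scenarios):
    print(f"{lhs} -> {rhs}  support={sup:.2f} confidence={conf:.2f}")
```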
This paper expounds the design and control of a new Variable Stiffness Series Elastic Actuator (VSSEA). It is established by employing a modular mechanical design approach that allows us to effectively optimise the stiffness modulation characteristics and power density of the actuator. The proposed VSSEA possesses the following features: i) no limitation in the work range of the output link, ii) a wide range of stiffness modulation (~20Nm/rad to ~1kNm/rad), iii) low-energy-cost stiffness modulation at equilibrium and non-equilibrium positions, iv) compact design and high torque density (~36Nm/kg), and v) high-speed stiffness modulation (~3000Nm/rad/s). Such features can help boost the safety and performance of many advanced robotic systems, e.g., a cobot that physically interacts with unstructured environments and an exoskeleton that provides physical assistance to human users. These features can also enable us to utilise the variable stiffness property to attain various regulation and trajectory tracking control tasks only by employing conventional controllers, eliminating the need for synthesising complex motion control systems in compliant actuation. To this end, it is experimentally demonstrated that the proposed VSSEA is capable of precisely tracking desired position and force control references through the use of conventional Proportional-Integral-Derivative (PID) controllers.
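Since the actuator is driven with conventional PID loops, a minimal discrete-time PID sketch may be useful context; the gains, timestep, and setpoints below are placeholders rather than values tuned for the VSSEA.

```python
# Minimal discrete-time PID controller sketch (illustrative gains and timestep;
# not tuned for the actuator described in the abstract).
class PID:
    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = 0.0

    def step(self, setpoint, measurement):
        error = setpoint - measurement
        self.integral += error * self.dt
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative

# Example: position loop at 1 kHz with made-up gains.
controller = PID(kp=50.0, ki=5.0, kd=1.0, dt=0.001)
torque_cmd = controller.step(setpoint=0.5, measurement=0.47)
```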
Basecalling is an essential step in nanopore sequencing analysis where the raw signals of nanopore sequencers are converted into nucleotide sequences, i.e., reads. State-of-the-art basecallers employ complex deep learning models to achieve high basecalling accuracy. This makes basecalling computationally-inefficient and memory-hungry, bottlenecking the entire genome analysis pipeline. However, for many applications, the majority of reads do not match the reference genome of interest (i.e., target reference) and thus are discarded in later steps in the genomics pipeline, wasting the basecalling computation. To overcome this issue, we propose TargetCall, the first fast and widely-applicable pre-basecalling filter to eliminate the wasted computation in basecalling. TargetCall's key idea is to discard reads that will not match the target reference (i.e., off-target reads) prior to basecalling. TargetCall consists of two main components: (1) LightCall, a lightweight neural network basecaller that produces noisy reads; and (2) Similarity Check, which labels each of these noisy reads as on-target or off-target by matching them to the target reference. TargetCall filters out all off-target reads before basecalling, and the highly-accurate but slow basecalling is performed only on the raw signals whose noisy reads are labeled as on-target. Our thorough experimental evaluations using both real and simulated data show that TargetCall 1) improves the end-to-end basecalling performance of the state-of-the-art basecaller by 3.31x while maintaining high (98.88%) sensitivity in keeping on-target reads, 2) maintains high accuracy in downstream analysis, 3) precisely filters out up to 94.71% of off-target reads, and 4) achieves better performance, sensitivity, and generality compared to prior works. We freely open-source TargetCall at https://github.com/CMU-SAFARI/TargetCall.
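A hedged, high-level sketch of the pre-basecalling filter idea follows; `light_basecall` is a stand-in for a LightCall-style network, the alignment threshold is invented, and the real implementation lives at the repository linked above.

```python
# High-level sketch of a pre-basecalling filter (not TargetCall's implementation;
# see https://github.com/CMU-SAFARI/TargetCall for the real tool).
import edlib  # assumed available; provides fast edit-distance alignment

def light_basecall(raw_signal):
    # Stand-in only: a real LightCall is a lightweight neural network basecaller.
    return "".join("ACGT"[int(x) % 4] for x in raw_signal)

def is_on_target(noisy_read, target_reference, max_divergence=0.25):
    # Infix alignment of the noisy read against the target reference; keep the
    # read if its normalized edit distance is small enough (threshold invented).
    result = edlib.align(noisy_read, target_reference, mode="HW", task="distance")
    return result["editDistance"] / max(len(noisy_read), 1) <= max_divergence

def prefilter(raw_signals, target_reference):
    # Only signals whose noisy reads look on-target go to the slow, accurate basecaller.
    return [sig for sig in raw_signals
            if is_on_target(light_basecall(sig), target_reference)]
```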
Resistive Random-Access Memory (RRAM) is well-suited to accelerate neural network (NN) workloads as RRAM-based Processing-in-Memory (PIM) architectures natively support highly-parallel multiply-accumulate (MAC) operations that form the backbone of most NN workloads. Unfortunately, NN workloads such as transformers require support for non-MAC operations (e.g., softmax) that RRAM cannot provide natively. Consequently, state-of-the-art works either integrate additional digital logic circuits to support the non-MAC operations or offload the non-MAC operations to CPU/GPU, resulting in significant performance and energy efficiency overheads due to data movement. In this work, we propose NEON, a novel compiler optimization to enable the end-to-end execution of the NN workload in RRAM. The key idea of NEON is to transform each non-MAC operation into a lightweight yet highly-accurate neural network. Utilizing neural networks to approximate the non-MAC operations provides two advantages: 1) We can exploit the key strength of RRAM, i.e., highly-parallel MAC operation, to flexibly and efficiently execute non-MAC operations in memory. 2) We can simplify RRAM's microarchitecture by eliminating the additional digital logic circuits while reducing the data movement overheads. Acceleration of the non-MAC operations in memory enables NEON to achieve a 2.28x speedup compared to an idealized digital logic-based RRAM. We analyze the trade-offs associated with the transformation and demonstrate feasible use cases for NEON across different substrates.
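To make the key idea concrete, here is a hedged toy example of replacing a non-MAC operation (GELU) with a tiny one-hidden-layer network whose evaluation is dominated by multiply-accumulates; it illustrates the principle only and is not NEON's transformation or its reported accuracy.

```python
# Toy illustration of the principle (not NEON's transformation): approximate a
# non-MAC operation (GELU) with a small one-hidden-layer network so most of the
# work becomes multiply-accumulate operations.
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(-4, 4, 2048)[:, None]
gelu = 0.5 * x * (1 + np.tanh(np.sqrt(2 / np.pi) * (x + 0.044715 * x**3)))

# Fixed random hidden layer (ReLU), least-squares readout.
W1 = rng.normal(0, 2.0, size=(1, 64))
b1 = rng.uniform(-4, 4, size=64)
H = np.maximum(x @ W1 + b1, 0.0)
w2, *_ = np.linalg.lstsq(H, gelu, rcond=None)

approx = H @ w2
print("max abs error:", np.abs(approx - gelu).max())
```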
A new development in NLP is the construction of hyperbolic word embeddings. As opposed to their Euclidean counterparts, hyperbolic embeddings are represented not by vectors, but by points in hyperbolic space. This makes the most common basic scheme for constructing document representations, namely the averaging of word vectors, meaningless in the hyperbolic setting. We reinterpret the vector mean as the centroid of the points represented by the vectors, and investigate various hyperbolic centroid schemes and their effectiveness at text classification.
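One standard centroid scheme such a study could consider is the Einstein midpoint in the Klein model, sketched below in numpy; the abstract does not specify which schemes the authors actually evaluate.

```python
# Einstein midpoint of points given in Klein-model coordinates (one standard
# hyperbolic centroid scheme; the paper may evaluate different ones).
import numpy as np

def einstein_midpoint(points):
    """points: (n, d) array with each row inside the unit ball (Klein model)."""
    norms_sq = np.sum(points**2, axis=1)
    gamma = 1.0 / np.sqrt(1.0 - norms_sq)          # Lorentz factors
    return (gamma[:, None] * points).sum(0) / gamma.sum()

pts = np.array([[0.1, 0.2], [0.4, -0.3], [-0.2, 0.5]])
print(einstein_midpoint(pts))
```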
The importance and complexity of neural networks (NNs) are growing. A neural network's performance (and energy efficiency) can be bound either by computation or by memory resources. The processing-in-memory (PIM) paradigm, which places computation near or inside memory arrays, is a viable solution to accelerate memory-bound NNs. However, PIM architectures come in many forms, and different PIM approaches lead to different trade-offs. Our goal is to analyze DRAM-based PIM architectures in terms of NN performance and energy efficiency. To do so, we analyze three state-of-the-art PIM architectures: (1) UPMEM, which integrates processors and DRAM arrays into a single 2D chip; (2) Mensa, a 3D-stacking-based PIM architecture tailored for edge devices; and (3) SIMDRAM, which uses the analog principles of DRAM to execute bit-serial operations. Our analysis shows that PIM greatly benefits memory-bound NNs: (1) UPMEM provides 23x the performance of a high-end GPU when the GPU requires memory oversubscription for a general matrix-vector multiplication kernel; (2) Mensa improves energy efficiency and throughput by 3.0x and 3.1x over the Google Edge TPU for 24 Google edge NN models; and (3) SIMDRAM outperforms a CPU/GPU by 16.7x/1.4x for three binary NNs. We conclude that the ideal PIM architecture for an NN model depends on the model's distinct attributes, due to the inherent architectural design choices of each system.
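As a back-of-envelope way to judge whether a kernel such as GEMV is compute- or memory-bound on a given substrate, a simple roofline estimate can be used; the peak numbers below are invented placeholders, not measurements from this study.

```python
# Back-of-envelope roofline check (illustrative peak numbers, not from the study).
def attainable_gflops(arithmetic_intensity, peak_gflops, peak_gbps):
    # A kernel is memory-bound when bandwidth * intensity < peak compute.
    return min(peak_gflops, arithmetic_intensity * peak_gbps)

# GEMV performs ~2 FLOPs per 4-byte FP32 weight it streams, so its arithmetic
# intensity is roughly 0.5 FLOP/byte.
print(attainable_gflops(0.5, peak_gflops=10_000, peak_gbps=900))  # bandwidth-limited
```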
A widely-studied deep reinforcement learning (RL) technique known as Prioritized Experience Replay (PER) lets agents learn from transitions sampled with probability proportional to their temporal-difference (TD) error. Although PER has been shown to be one of the most crucial components for the overall performance of deep RL methods in discrete action domains, many empirical studies indicate that it considerably underperforms off-policy actor-critic algorithms in continuous control. We theoretically show that actor networks cannot be effectively trained on transitions with large TD errors. As a result, the approximate policy gradient computed under the Q-network differs from the actual gradient computed under the optimal Q-function. Motivated by this, we introduce a novel experience replay sampling framework for actor-critic methods that also accounts for stability issues and the recently identified causes of PER's poor empirical performance. The introduced algorithm suggests a new branch of improvements for the effective and efficient training of both actor and critic networks. An extensive set of experiments verifies our theoretical claims and demonstrates that the introduced method significantly outperforms competing approaches, achieving state-of-the-art results over standard off-policy actor-critic algorithms.
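For reference, standard proportional prioritized sampling with importance-sampling corrections can be sketched as follows (a toy version without the sum-tree structure and without the modifications introduced by the paper):

```python
# Toy proportional prioritized sampling with importance-sampling weights
# (no sum-tree; standard PER-style sampling, not the introduced algorithm).
import numpy as np

def sample_prioritized(td_errors, batch_size, alpha=0.6, beta=0.4, eps=1e-6, rng=None):
    rng = rng or np.random.default_rng()
    priorities = (np.abs(td_errors) + eps) ** alpha
    probs = priorities / priorities.sum()
    idx = rng.choice(len(td_errors), size=batch_size, p=probs)
    # Importance-sampling weights correct for the non-uniform sampling.
    weights = (len(td_errors) * probs[idx]) ** (-beta)
    return idx, weights / weights.max()

idx, w = sample_prioritized(np.array([0.1, 2.0, 0.5, 0.05]), batch_size=2)
```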
Long-latency load requests continue to limit the performance of high-performance processors. To improve a processor's latency tolerance, architects have mainly relied on two key techniques: sophisticated data prefetchers and large on-chip caches. In this work, we show that: 1) even a state-of-the-art prefetcher can only predict half of the off-chip load requests on average across a wide range of workloads, and 2) due to the increasing size and complexity of on-chip caches, a large fraction of an off-chip load request's latency is spent accessing the on-chip cache hierarchy. The goal of this work is to accelerate off-chip load requests by removing the on-chip cache access latency from their critical path. To this end, we propose a new technique called Hermes, whose key ideas are: 1) accurately predict which load requests are likely to go off-chip, and 2) speculatively fetch the data required by the predicted off-chip loads directly from main memory, while also concurrently accessing the cache hierarchy for such loads. To enable Hermes, we develop a new lightweight, perceptron-based off-chip load prediction technique that learns to identify off-chip load requests using multiple program features (e.g., the sequence of program counters). For every load request, the predictor observes a set of program features to predict whether the load will go off-chip. If the load is predicted to go off-chip, Hermes issues a speculative request directly to the memory controller as soon as the load's physical address is generated. If the prediction is correct, the load eventually misses in the cache hierarchy and waits for the ongoing speculative request to finish, thereby hiding the on-chip cache hierarchy access latency from the critical path of the off-chip load. Our evaluation shows that Hermes significantly improves performance over a state-of-the-art baseline. We open-source Hermes.
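A hedged sketch of a hashed-perceptron-style predictor in the spirit described above follows; the feature set, table sizes, and thresholds are invented, and this is not the authors' open-sourced implementation.

```python
# Hashed-perceptron-style off-chip load predictor sketch (invented feature set,
# table sizes, and thresholds; illustrative only).
class OffChipPredictor:
    TABLE_SIZE = 1024
    THRESHOLD = 0          # predict off-chip when the summed weight exceeds this
    W_MAX, W_MIN = 31, -32

    def __init__(self, num_features=3):
        self.tables = [[0] * self.TABLE_SIZE for _ in range(num_features)]

    def _indices(self, features):
        return [hash(f) % self.TABLE_SIZE for f in features]

    def predict(self, features):
        s = sum(t[i] for t, i in zip(self.tables, self._indices(features)))
        return s > self.THRESHOLD, s

    def train(self, features, went_off_chip):
        predicted, s = self.predict(features)
        # Update on a misprediction or when confidence is low (perceptron rule).
        if predicted != went_off_chip or abs(s) < 8:
            delta = 1 if went_off_chip else -1
            for t, i in zip(self.tables, self._indices(features)):
                t[i] = max(self.W_MIN, min(self.W_MAX, t[i] + delta))

# Example features: load PC, a short PC history hash, and a cache-line offset.
pred = OffChipPredictor()
pred.train(("pc=0x401a2c", "hist=0x77", "off=3"), went_off_chip=True)
print(pred.predict(("pc=0x401a2c", "hist=0x77", "off=3")))
```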